Biostatistics For Dummies (Monika Wahi John Pezzullo)

A variable can refer to a single value or to a collection of values called an array. Arrays can have one

or more dimensions.

One-dimensional arrays

A one-dimensional array can be thought of as a list of values. For instance, you may record a list of

fasting glucose values (in milligrams per deciliter, mg/dL) from five study participants as 86, 110, 95,

125, and 64. You could use the variable name Gluc to refer to this array containing five numbers, or

elements. Using the term Gluc in a formula refers to the entire five-element array.
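As a minimal sketch of what using the whole array in a formula can look like, the Python snippet below stores the five glucose values from the text in a list and computes their mean. The lowercase name gluc and the choice of the mean as the example formula are illustrative assumptions, not something the text prescribes.

```python
from statistics import mean

# The five fasting glucose values (mg/dL) from the text.
gluc = [86, 110, 95, 125, 64]

# A formula that uses the name gluc operates on the entire array.
n_elements = len(gluc)   # number of elements in the array: 5
gluc_mean = mean(gluc)   # (86 + 110 + 95 + 125 + 64) / 5 = 96

print(n_elements, gluc_mean)
```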

You can refer to one particular element of this array (meaning one glucose measurement) in several

ways. You can use the index of the array, which is the number that indicates the position of the element

to which you are referring in the array.

In a typeset formula, indices are typically indicated using subscripts. For example, $Gluc_3$ refers to

the third element in the array (which would be 95 in our example).

In a plain text formula, indices are typically indicated using brackets (such as Gluc[3]).

The index can be a variable like i, so Gluc[i] would refer to the ith element of the array. The term ith

means that i is allowed to take on any whole-number value from 1 up to the total number of

elements in the array (which in this case would be 5).

In some programming languages and statistical books and articles, the indices start at 0 for the

first element, 1 for the second element, and so on, which can be confusing. In this book, all arrays

are indexed starting at 1.
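Python is one of the languages that starts indices at 0, so the element this book writes as Gluc[3] is written gluc[2] there. The sketch below illustrates the offset. The helper element_at is a hypothetical name introduced only for this illustration.

```python
# The five fasting glucose values (mg/dL) from the text.
gluc = [86, 110, 95, 125, 64]


def element_at(array, i):
    """Return the ith element using the book's 1-based convention.

    Hypothetical helper: Python lists are 0-based, so subtract 1.
    """
    if not 1 <= i <= len(array):
        raise IndexError(f"index {i} is outside 1..{len(array)}")
    return array[i - 1]


third = element_at(gluc, 3)   # the book's Gluc[3], which is 95
same_value = gluc[2]          # the same element in Python's own 0-based indexing

# Let i take every value from 1 to the number of elements (5 here).
all_values = [element_at(gluc, i) for i in range(1, len(gluc) + 1)]
```

Subtracting 1 in a single place keeps the 1-based notation of the text and the 0-based storage of the language from being mixed up elsewhere in an analysis.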

Higher-dimensional arrays

A two-dimensional array can be understood as a table of values with rows and columns, like a block of

cells in a spreadsheet. There are also higher-dimensional arrays that can be thought of as a whole

collection of tables. Suppose that you measure the fasting glucose on five participants on each of three

treatment days. You could think of your 15 measurements being laid out in a table with five rows and

three columns. If you want to represent this entire table with a single variable name like Gluc, you can

use double-indexing, with the first index specifying the participant (1 through 5), and the second index

specifying the day of the measurement (1 through 3). Under that system, Gluc[3,2] indicates the fasting

glucose measurement for participant 3 on day 2. To express the array as a formula, we would use the

expression Gluc[i,j], which specifies the fasting glucose for the ith participant on the jth day.
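The double-indexing scheme can be sketched in Python as a list of rows, one row per participant and one column per day. The text gives no values for this table, so the numbers below are placeholders: day 1 reuses the five values from the one-dimensional example, and the day 2 and day 3 values are invented purely for illustration. The helper element_at is likewise a hypothetical name.

```python
# Rows are participants 1-5, columns are days 1-3.
# Day 1 reuses the five values from the text; the day 2 and day 3
# values are made-up placeholders, not data from the book.
gluc = [
    [86, 90, 88],
    [110, 105, 108],
    [95, 99, 97],
    [125, 120, 118],
    [64, 70, 72],
]


def element_at(table, i, j):
    """Return the book's Gluc[i,j] (1-based) from a 0-based list of rows."""
    return table[i - 1][j - 1]


n_participants = len(gluc)       # 5 rows
n_days = len(gluc[0])            # 3 columns

# Participant 3 on day 2, which is 99 in this placeholder table.
value = element_at(gluc, 3, 2)
```

The first index selects the row (participant) and the second selects the column (day), matching the order used in Gluc[i,j].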

Special terms may be used to refer to arrays with one or two dimensions:

A one-dimensional array is also referred to as a vector. But this can be confusing, because the term

vector is also used in mathematics, physics, and biology to refer to completely different concepts.

A two-dimensional array is sometimes called a matrix (plural: matrices). To some, this term